Using Sacks to Organize Registers in VLIW Machines

Authors

Josep Llosa

Mateo Valero

José A. B. Fortes

Eduard Ayguadé

Abstract

This paper analyses the register requirements of software pipelined inner loops. When the number of functional units and/or the number of stages of individual functional units is increased, the number of registers required may be prohibitive in chip area and cycle time. We characterize the lifetimes of values in pipelined loops by their loop register locality (LRL). Based on this characteristic, we propose a new organization of the register file that increases the number of registers without affecting cycle time, while also reducing area. This can be useful to minimize the frequency of spills at a reasonable cost, since spill code can increase the minimum initiation interval and decrease loop performance. The new organization consists of a small, high-bandwidth multiported register file and a low-bandwidth, port-limited register file called a sack. A mechanism to assign values to the sack is presented. We demonstrate the effectiveness of our approach by experimenting with a collection of loops from the Perfect Club benchmark suite. We also ran experiments to find the optimal number of registers in the sack and measured the effect of spill code on loop performance.
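As a rough illustration of what "register requirements of a software pipelined loop" means, the sketch below counts how many values are live at once in the steady state of a loop that starts a new iteration every `ii` cycles. It is not the paper's LRL metric or its sack assignment mechanism, neither of which the abstract defines; the function name `max_live`, the `(start, end)` lifetime encoding, and the sample lifetimes are all assumptions made for this example.

```python
from typing import List, Tuple


def max_live(lifetimes: List[Tuple[int, int]], ii: int) -> int:
    """Peak number of simultaneously live values in the steady state.

    Assumed encoding: each lifetime is (start, end) in cycles of a single
    iteration's schedule, and the value occupies a register during cycles
    start .. end-1. A new iteration begins every `ii` cycles, so cycle c
    of any iteration overlaps kernel row c % ii.
    """
    if ii <= 0:
        raise ValueError("initiation interval must be positive")
    live = [0] * ii  # live[r] = values alive in kernel row r
    for start, end in lifetimes:
        for cycle in range(start, end):
            live[cycle % ii] += 1
    return max(live)


# Hypothetical lifetimes for three values produced in one iteration.
sample = [(0, 3), (1, 2), (2, 6)]
tight = max_live(sample, ii=2)    # iterations overlap heavily
relaxed = max_live(sample, ii=4)  # iterations overlap less
```

With these made-up lifetimes, the smaller initiation interval needs more registers than the larger one, which is the tension the abstract describes: a faster loop overlaps more iterations and so keeps more values alive at once.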

Similar resources

Improving DTSVLIW Performance via Block Compaction

 Dynamically Trace Scheduled VLIW (DTSVLIW) machines have two execution engines and two instruction caches: a Scheduler Engine and a VLIW Engine, and an Instruction Cache and a VLIW Cache. The Scheduler Engine fetches instructions from the Instruction Cache and executes them singly, the first time, using a simple pipelined processor. In addition, it dynamically schedules the instruction trace ...

Method and apparatus for the selective scoreboarding of computation results

Statically scheduled machines do have a disadvantage when dealing with dynamic events, such as cache hit or miss detection. Early VLIW machines were designed without caches, to achieve predictability in memory access. However, such designs suffer in memory performance. To achieve high performance, VLIW architectures must have adequate support for using caches. A simple VLIW design might use an ...

Machine-Description Driven Compilers for EPIC and VLIW Processors

In the past, due to the restricted gate count available on an inexpensive chip, embedded DSPs have had limited parallelism, few registers and irregular, incomplete interconnectivity. More recently, with increasing levels of integration, embedded VLIW processors have started to appear. Such processors typically have higher levels of instruction-level parallelism, more registers, and a relatively...

An Operation Rearrangement Technique for Low-Power VLIW Instruction Fetch

As mobile applications are required to handle more computing-intensive tasks, many mobile devices are designed using VLIW processors for high performance. In VLIW machines where a single instruction contains multiple operations, the power consumption during instruction fetches varies significantly depending on how the operations are arranged within the instruction. In this paper, we describe a p...

DTSVLIW: VLIW Performance with Sequential Code

 Due to the temporal execution locality present in programs, even small instruction caches (16-Kbyte) can provide processors with fast access to instructions most of the time. The Dynamically Trace Scheduled VLIW (DTSVLIW) architecture exploits programs’ temporal execution locality by executing code in two distinct modes. In the first execution encounter, fragments of the code are executed in ...

Publication date: 1994
